Piecewise convexity of artificial neural networks

Authors

Blaine Rister

Daniel L. Rubin

Abstract

Although artificial neural networks have shown great promise in applications including computer vision and speech recognition, there remains considerable practical and theoretical difficulty in optimizing their parameters. The seemingly unreasonable success of gradient descent methods in minimizing these non-convex functions remains poorly understood. In this work we offer some theoretical guarantees for networks with piecewise affine activation functions, which have in recent years become the norm. We prove three main results. First, that the network is piecewise convex as a function of the input data. Second, that the network, considered as a function of the parameters in a single layer, all others held constant, is again piecewise convex. Third, that the network as a function of all its parameters is piecewise multi-convex, a generalization of biconvexity. From here we characterize the local minima and stationary points of the training objective, showing that they minimize the objective on certain subsets of the parameter space. We then analyze the performance of two optimization algorithms on multi-convex problems: gradient descent, and a method which repeatedly solves a number of convex sub-problems. We prove necessary convergence conditions for the first algorithm and both necessary and sufficient conditions for the second, after introducing regularization to the objective. Finally, we remark on the remaining difficulty of the global optimization problem. Under the squared error objective, we show that by varying the training data, a single rectifier neuron admits local minima arbitrarily far apart, both in objective value and parameter space.
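As a rough illustration of the second result (piecewise convexity in the parameters of a single layer), the sketch below uses a hypothetical one-hidden-layer rectifier network with a scalar output. The network, weights, and input are assumptions chosen for illustration and are not taken from the paper. While the set of active hidden units stays fixed, the output is affine (hence convex) in the hidden-layer weights; once a unit switches off, the convexity inequality can fail.

```python
# Illustrative sketch only: a hypothetical network f(W) = sum_j v[j] * max(0, W[j] . x).
# All names and numbers here are assumptions, not the paper's construction.

def relu(z):
    return max(0.0, z)

def pattern(W, x):
    """Which hidden units are active (pre-activation > 0) for input x."""
    return tuple(sum(wi * xi for wi, xi in zip(row, x)) > 0 for row in W)

def network(W, v, x):
    """Scalar output for hidden weights W, output weights v, input x."""
    return sum(vj * relu(sum(wi * xi for wi, xi in zip(row, x)))
               for vj, row in zip(v, W))

def midpoint(A, B):
    """Entrywise midpoint of two weight matrices."""
    return [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

x = [1.0, 2.0]
v = [1.0, -3.0]                    # negative output weight: f is not convex in W overall

W_a = [[1.0, 1.0], [0.5, 0.5]]     # both hidden units active
W_b = [[2.0, 0.0], [1.0, 1.0]]     # both hidden units active (same piece as W_a)
W_c = [[1.0, 1.0], [-0.5, -0.5]]   # second unit inactive (a different piece)

# Within one piece, f is affine along the segment from W_a to W_b.
same_piece_mid = network(midpoint(W_a, W_b), v, x)
same_piece_avg = (network(W_a, v, x) + network(W_b, v, x)) / 2

# Across pieces, f at the midpoint can exceed the average of the endpoints.
cross_piece_mid = network(midpoint(W_a, W_c), v, x)
cross_piece_avg = (network(W_a, v, x) + network(W_c, v, x)) / 2

print(pattern(W_a, x), pattern(W_b, x), pattern(W_c, x))
print(same_piece_mid, same_piece_avg)
print(cross_piece_mid, cross_piece_avg)
```

The set of weights sharing one activation pattern is an intersection of half-spaces, so it is convex, which is why the midpoint test inside a piece is meaningful.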

Similar resources

Prediction of Gas Lift Parameters Using Artificial Neural Networks

Prediction the Return Fluctuations with Artificial Neural Networks' Approach

Temporal changes in returns, the inefficiency of previous studies, and the presence of factors affecting the share return rate have driven the development of modern, intelligent methods for estimating and evaluating share returns in listed companies. The aim of this research is to predict returns from financial variables using an artificial neural network approach. Therefore, the statistical population of this study incl...

HYBRID ARTIFICIAL NEURAL NETWORKS BASED ON ACO-RPROP FOR GENERATING MULTIPLE SPECTRUM-COMPATIBLE ARTIFICIAL EARTHQUAKE RECORDS FOR SPECIFIED SITE GEOLOGY

The main objective of this paper is to use ant-optimized neural networks to generate artificial earthquake records. To this end, training accelerograms are selected according to the site geology of the recording station, and the Wavelet Packet Transform (WPT) is used to decompose these records. Then Artificial Neural Networks (ANN) optimized with Ant Colony Optimization and resilient Backpropagation algorith...

Prediction of breeding values for the milk production trait in Iranian Holstein cows applying artificial neural networks

Artificial neural networks, learning algorithms and mathematical models that mimic the information-processing ability of the human brain, can be applied to non-linear and complex data. The aim of this study was to predict breeding values for the milk production trait in Iranian Holstein cows using artificial neural networks. Data on 35167 Iranian Holstein cows recorded between 1998 and 2009 were ...

Product Yields Prediction of Tehran Refinery Hydrocracking Unit Using Artificial Neural Networks

Journal:

Neural networks : the official journal of the International Neural Network Society

Volume 94

Pages: -

Publication date: 2017
